Which Coreference Evaluation Metric Do You Trust? A Proposal for a Link-based Entity Aware Metric

نویسندگان

  • Nafise Sadat Moosavi
  • Michael Strube
چکیده

Interpretability and discriminative power are the two most basic requirements for an evaluation metric. In this paper, we report the mention identification effect in the B3, CEAF, and BLANC coreference evaluation metrics that makes it impossible to interpret their results properly. The only metric which is insensitive to this flaw is MUC, which, however, is known to be the least discriminative metric. It is a known fact that none of the current metrics are reliable. The common practice for ranking coreference resolvers is to use the average of three different metrics. However, one cannot expect to obtain a reliable score by averaging three unreliable metrics. We propose LEA, a Link-based Entity-Aware evaluation metric that is designed to overcome the shortcomings of the current evaluation metrics. LEA is available as branch LEA-scorer in the reference implementation of the official CoNLL scorer.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Maximum Metric Score Training for Coreference Resolution

A large body of prior research on coreference resolution recasts the problem as a two-class classification problem. However, standard supervised machine learning algorithms that minimize classification errors on the training instances do not always lead to maximizing the F-measure of the chosen evaluation metric for coreference resolution. In this paper, we propose a novel approach comprising t...

متن کامل

On Coreference Resolution Performance Metrics

The paper proposes a Constrained EntityAlignment F-Measure (CEAF) for evaluating coreference resolution. The metric is computed by aligning reference and system entities (or coreference chains) with the constraint that a system (reference) entity is aligned with at most one reference (system) entity. We show that the best alignment is a maximum bipartite matching problem which can be solved by ...

متن کامل

Corpus based coreference resolution for Farsi text

"Coreference resolution" or "finding all expressions that refer to the same entity" in a text, is one of the important requirements in natural language processing. Two words are coreference when both refer to a single entity in the text or the real world. So the main task of coreference resolution systems is to identify terms that refer to a unique entity. A coreference resolution tool could be...

متن کامل

Evaluation of Coreference Resolution Tools for Polish from the Information Extraction Perspective

In this paper we discuss the performance of existing tools for coreference resolution for Polish from the perspective of information extraction tasks. We take into consideration the source of mentions, i.e., gold standard vs mentions recognized automatically. We evaluate three existing tools, i.e., IKAR, Ruler and Bartek on the KPWr corpus. We show that the widely used metrics for coreference e...

متن کامل

A Metric for Evaluating Discourse Coherence based on Coreference Resolution

We propose a simple and effective metric for automatically evaluating discourse coherence of a text using the outputs of a coreference resolution model. According to the idea that a writer tends to appropriately utilise coreference relations when writing a coherent text, we introduce a metric of discourse coherence based on automatically identified coreference relations. We empirically evaluate...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2016